Abstract

We  address  a  problem  of  simultaneous  quality  and  quantity  control  motivated  by 
semiconductor  manufacturing.  After  wafers  are  fabricated,  they  are  probed,  or  electrically 
tested, and in some cases the probing facility is the bottleneck for the entire IC manufacturing
process. Under this assumption, we consider the problem of choosing the optimal
start  rate  of  lots  of  wafers  into  the  fabrication  facility  and  the  optimal  screening  policy  in 
front  of  the  probing  facility  to  maximize  the  expected  profit,  which  is  the  revenue  from 
good  chips  minus  the  variable  fabrication  and  probing  costs.  The  screening  policy  decides 
which wafers to discard and which wafers to probe. These decisions are subject to capacity
constraints at both the wafer fabrication and probing facilities. An empirical Bayes
approach is employed: the number of bad chips on a wafer is assumed to be a gamma
random variable, where the scale parameter is unknown and varies from lot to lot according
to another gamma distribution. We fit the yield model to industrial data and test the
optimal policy on this data.

This paper and its companion (Longtin et al. 1992) address a particular quality
management  issue  in  semiconductor  manufacturing.  The  production  of  integrated  circuits 
consists  of  four  main  stages:  wafer  fabrication,  probing,  packaging  and  final  testing,  and 
we  will  focus  on  the  interrelationship  between  wafer  fabrication  and  probing.  In  wafer 
fabrication,  disc-like  wafers  that  contain  hundreds  of  integrated  circuits,  or  chips,  are 
produced  (in  batches,  or  lots,  of  usually  20  to  50  wafers)  by  a  very  long  and  complex 
procedure  involving  hundreds  of  operations.  After  fabrication  is  completed,  each  chip 
on  a  wafer  is  probed,  or  electrically  tested,  to  distinguish  between  defective  chips  and 
good  chips.  Each  wafer  is  then  separated  into  its  respective  chips,  and  nondefective 
chips  are  covered  in  a  protective  plastic  during  packaging.  In  final  testing,  the  chips  are 
functionally  tested  under  a  variety  of  environmental  conditions  before  being  shipped  to 
customers. 

Problem  Description.  To  motivate  our  model  formulation,  the  process  economics 
and the key material flow issues need to be briefly described. Building a wafer fabrication
facility, or fab, costs hundreds of millions of dollars. Consequently, fab managers
are very concerned with maintaining high utilization of the bottleneck equipment, and
one of the biggest operational decisions for the fab manager is to determine the start
rate of wafers into the fab. Because of the huge amount of statistical variability in the
fab (due primarily to random yield, rework, and tool failures; see Chen et al. 1988 for
details), pushing the start rate beyond a certain level, which we call the fab's effective
capacity, will result in unacceptably high levels of work-in-process inventory and long
lead times.

Despite  the  well-documented  congestion  that  occurs  in  wafer  fabrication,  we  have 
visited  several  facilities  where  the  probing  (we  will  often  use  the  more  generic  term  testing 
rather  than  probing)  facility,  not  the  wafer  fab,  is  the  bottleneck  that  determines  the 
production  capacity  of  finished  goods.  There  are  several  reasons  for  this  phenomenon:  the 
testing  equipment  is  very  expensive  (machines  can  cost  three  to  four  million  dollars)  and 
the  testing  procedure  is  a  very  time  consuming  and  labor  intensive  process.  Furthermore, 
testing  capacity  is  sometimes  labor  constrained  because  companies  are  either  unable  or 
unwilling  to  hire  and  train  full-time  employees  in  the  face  of  uncertain  future  demand. 

Although  relative  costs  and  revenues  depend  greatly  on  the  type  of  market  (e.g., 
commodity  or  custom  chips)  and  other  factors,  the  variable  testing  cost  per  wafer  is 
typically  only  several  percent  of  total  variable  production  cost,  and  the  revenue  from  a 
wafer  of  nondefective  chips  is  roughly  ten  times  the  variable  production  cost  per  wafer. 
Also,  the  yield  in  wafer  fabrication,  which  is  the  fraction  of  chips  that  are  good,  can  be 
very  low  and  erratic.  Since  many  facilities  are  capacity  constrained  rather  than  market 
constrained,  they  can  sell  anything  they  make,  and  any  increase  in  yield  leads  directly 
to  an  increase  in  profit.  Consequently,  yield  dominates  the  economics  of  the  process  and 
is  the  primary  concern  of  fab  managers. 

Semiconductor  manufacturers  typically  use  an  exhaustive  testing  policy;  that  is,  every 
chip  of  every  wafer  is  tested  and  is  deemed  defective  or  nondefective.  Indeed,  the  thought 
of  simply  discarding  a  completed  chip  before  it  undergoes  testing  (unless  it  represents 
"leftovers"  from  a  custom  order  that  has  already  been  filled)  goes  very  much  against  the 
grain of mainstream industry thinking. In contrast, our paper and Longtin et al. are
based  on  the  following  simple  premise  that  has  also  been  put  forth  in  Goldratt  and  Cox 
(1984):  profitability  can  be  increased  by  preventing  bottleneck  equipment  from  working 
on  products  that  are  already  defective.  In  particular,  if  testing  is  the  bottleneck  operation 
under an exhaustive testing policy, then semiconductor manufacturers can increase their
profits by simultaneously (1) employing a sequential screening procedure that adaptively
discards,  rather  than  tests,  portions  of  wafers  (or  entire  wafers  or  even  entire  lots)  that  are 
thought  to  have  a  sufficiently  low  proportion  of  nondefective  chips,  and  (2)  increasing  the 
start  rate  of  wafers.  Of  course,  if  the  rate  of  wafer  starts  is  increased,  so  is  the  congestion 
in  the  fab  and  the  production  costs,  and  these  two  factors  need  to  be  taken  into  account. 

To  test  this  premise,  we  consider  the  following  problem  of  simultaneous  quality  and 
quantity control: determine the start rate of lots of wafers into the fab and find a sequential
screening policy for the testing facility to maximize the expected long run average
revenue  from  nondefective  chips  minus  the  variable  fabrication  and  testing  costs  of  wafers. 
The  two  controls  are  subject  to  constraints  on  the  average  effective  capacity  of  both  the 
fab and the testing station; we assume that the testing capacity constraint is more restrictive
than the fab constraint when an exhaustive testing policy is in use. To minimize
confusion, a screening procedure in isolation will be referred to as a policy and a screening
policy coupled with a start rate will be referred to as a strategy.

In practice, the resulting increase in profit that an optimal strategy will achieve relative
to the exhaustive testing strategy commonly used in industry (that is, an exhaustive
testing policy with a start rate that keeps the testing facility working at its effective
capacity) depends greatly on two factors that will be discussed below: (1) the relative
congestion levels of the fab and the testing facility under the exhaustive testing policy,
and (2) the nature of the yield variability. Indeed, if the fab were more highly congested
than the testing facility under an exhaustive testing policy, then this policy might be
optimal, and hence sequential screening would be of no value. However, testing is also
performed after various key operations in the fab, and often (for example, see the simulation
models of Atherton and Dayhoff 1985, Glassey and Resende 1988 and Wein 1988) the
bottleneck  workstation  in  the  fab  is  the  photolithography  workstation,  to  which  wafers 
make  many  (up  to  twenty)  visits  during  their  processing.  Thus,  the  framework  presented 
here  can  also  be  used  to  perform  sequential  screening  at  key  tests  in  the  fab.  That  is,  the 
start rate of wafers can be increased and undesirable wafers or chips can be discarded at
in-fab tests so that the bottleneck equipment works on higher quality chips. However,
this  procedure  may  not  be  as  effective  in  the  fab  as  it  is  at  probe,  because  type  I  and 
type  II  errors  are  apt  to  be  more  prevalent  in  the  fab.  In  particular,  in-fab  testing  is 
often  visual  and  is  not  as  discerning  as  electrical  testing,  and  a  chip  that  is  correctly 
found  to  be  nondefective  at  an  in-fab  test  may  become  defective  before  its  next  visit 
to  the  bottleneck  workstation.  We  will  hereafter  assume  that  testing  is  the  bottleneck 
operation,  and  more  specifically,  our  numerical  studies  here  and  in  Longtin  et  al.  assume 
that  the  fab  is  at  90%  of  its  effective  capacity  when  the  exhaustive  testing  strategy  is 
employed. 

Yield  Modeling.  We  now  discuss  the  nature  of  the  yield  variability.  Low  yield 
in  wafer  fabrication  occurs  for  a  variety  of  reasons,  including  short  product  life  cycles, 
particulate  contamination  (see  Osburn  et  al.  1988),  misalignment  of  operations,  and 
chemical  imbalances.  Also,  defective  chips  are  difficult  to  detect  visually,  and  the  industry 
relies  heavily  on  the  probing  machines.  Intuitively,  sequential  screening  will  only  be 
effective  if  dependencies  and/or  nonuniformities  in  yield  can  be  identified  and  exploited. 
After  all,  if  every  chip  processed  by  the  fab  had  the  same  probability  of  being  defective, 
independently  of  all  other  chips,  then  sequential  screening  would  be  fruitless.  However, 
several  types  of  dependencies  do  exist  and,  indeed,  one  of  our  primary  goals  in  this  pair  of 
papers  is  to  analyze  industrial  data  and  determine  which  dependencies  are  most  prevalent 
and  easiest  to  exploit. 

Recall  that  chips  are  produced  on  wafers,  and  wafers  travel  through  the  fab  in  lots. 
Dependencies  may  be  present  at  all  three  levels  (lots,  wafers,  chips),  including 

(1)  dependence  across  consecutive  lots:  the  yield  of  consecutive  lots  may  be  positively 
correlated  because  of  machines  that  go  in  and  out  of  control,  or  batch  operations,  such 
as  diffusion  or  oxidation,  that  simultaneously  process  multiple  lots; 

(2)  nonuniformity  in  chip  type:  some  chip  types  may  be  inherently  easier  to  produce 
than  others; 


(3)  dependence  of  wafers  within  a  lot:  positive  serial  correlation  of  wafer  yields  within 
a  lot  may  be  due  to  operations  that  simultaneously  process  one  or  more  lots  of  wafers, 
or to wafer-by-wafer operations that incur a joint set-up for an entire lot;

(4) dependence of neighboring chip locations on a wafer: defective chips are often
found  in  clusters  (see  Mallory  et  al.  1983  for  empirical  data),  which  may  be  due  to 
processing  or  particulate  contamination; 

(5) radial nonuniformity on a wafer: handling and processing can cause a donut-shaped
yield with more defective chips on the edge of the wafer and, to a lesser extent,
in the center of the wafer (see Ferris-Prabhu et al. 1987 for empirical data); and

(6)  dependence  of  a  chip  location  across  wafers  within  a  lot:  mask  defects  and  batch 
operations can cause the yield of a chip location to be positively correlated across consecutive
wafers.

Furthermore,  sequential  screening  can  be  performed  at  all  three  levels:  we  can  discard 
(i)  entire  lots  of  wafers  based  on  the  yield  from  previous  lots,  (ii)  wafers  in  a  lot  based 
on the yield of previously tested wafers from the same lot, or (iii) chips on a wafer based
on  the  yield  of  previously  tested  chips.  We  do  not  pursue  screening  of  type  (i)  because 
dependency  (1)  is  not  very  prevalent  in  wafer  fabrication;  since  a  wafer  fab  is  far  from  a 
flow  line  operation,  lots  that  are  processed  together  in  the  same  oven  during  a  particular 
batch  operation  tend  to  go  their  separate  ways  and  do  not  arrive  together  at  the  testing 
facility. Also, all the industrial data sets that we analyze contain lots of only one chip
type,  and  hence  nonuniformity  (2)  will  not  be  addressed.  However,  this  factor  could  be 
addressed  in  our  framework  by  developing  a  different  yield  model  for  each  type  of  chip. 

Our two studies employ sequential screening of types (ii) and (iii) to exploit dependencies
(3)-(6). The factors underlying dependency (3), coupled with the high degree of
randomness in the production process, lead to a significant amount of lot-to-lot variability
in mean yield. In this paper, sequential screening of type (ii) is employed to exploit
lot-to-lot variability. Our yield model assumes that the number of defective chips on each
wafer in a given lot is an independent gamma random variable with shape parameter $\alpha$
and scale parameter $\beta$. An empirical Bayes approach is used, where the scale parameter
$\beta$ is unknown and varies from lot to lot; for each lot, the parameter $\beta$ is chosen independently
from a (different) gamma prior distribution. A sequential screening policy in this
setting decides when to discard the remaining wafers in a lot.

Industrial  data  from  Bohn  (1991)  is  used  to  estimate  the  parameters  for  the  yield 
model in this paper. Since the primitive empirical data in Bohn is the number of nondefective
chips on each wafer, this data cannot be used to analyze the more detailed
spatial and temporal dependencies described in (4)-(6). Longtin et al. analyze over 300
wafer maps (see the Appendix of that paper for some examples) from two wafer fabs,
and  model  the  chip  yield  by  a  Markov  random  field,  which  is  a  stochastic  model  that 
allows  the  probability  of  a  chip  being  nondefective  to  depend  on  the  resulting  yield  of  the 
neighboring  chips.  A  variety  of  sequential  screening  strategies  of  type  (iii)  are  proposed 
that discard individual chips on a wafer. In summary, the present paper employs sequential
screening at the wafer level to exploit lot-to-lot variability and Longtin et al. employ
sequential  screening  at  the  chip  level  to  exploit  detailed  spatial  dependencies  within  a 
lot. 

The two key aspects of yield modeling that we focus on, lot-to-lot variability and
spatial dependence on and across wafers, have received very little attention in the IC yield
modeling literature. We know of no models capturing the former aspect and Flack (1985)
appears  to  contain  the  only  yield  model  that  explicitly  accounts  for  spatial  dependence 
of  chips  on  a  wafer.  Nearly  all  the  existing  yield  literature  (see  Cunningham  1990  for  a 
recent  survey)  calculates  the  proportion  of  nondefective  chips  on  a  wafer  by  considering 
the chip area and density of point defects on the wafer. These derivations lead to a two
parameter  distribution  (the  negative  binomial  distribution,  which  describes  a  Poisson 
random  variable  mixed  with  a  gamma,  appears  to  be  the  most  effective)  that  can  be 
fitted  to  the  mean  and  variance  of  the  empirical  data  for  the  number  of  nondefective 
chips per wafer. According to Cunningham, the goal of most of the chip yield modeling
research has been to predict costs and actual yields, and to determine the appropriate
level of circuit integration. Albin and Friedman's (1989) work on acceptance sampling
appears to be the first to employ a yield model in a quality control context: they use a two
parameter distribution (the Neyman type A, which is a Poisson compounded Poisson) to
model  the  number  of  defective  chips  on  a  wafer.  Because  they  were  interested  in  quality 
control  issues  rather  than  circuit  design  issues,  they  directly  modeled  the  yield  without 
resorting  to  the  defect  density  and  chip  area,  and  we  do  the  same  in  this  pair  of  papers. 

Summary of Results. The optimization problem addressed in this paper is essentially
an optimal stopping problem embedded within a mathematical program, and the
optimal  solution  is  determined  numerically  by  solving  a  series  of  parameterized  optimal 
stopping  problems.  Since  the  optimal  strategy  is  difficult  to  calculate,  we  also  find  the 
optimal  fixed  sample  size  strategy,  where  a  fixed  number  of  wafers  from  each  lot  is  tested, 
after  which  the  controller  either  discards  or  tests  all  the  remaining  wafers  in  a  lot.  Five 
of  Bohn's  industrial  data  sets  are  used  to  estimate  the  parameters  of  the  yield  model, 
and  the  two  proposed  policies  are  derived  for  all  five  data  sets.  For  our  parameter  values, 
the  maximum  possible  profit  increase  that  an  optimal  strategy  can  achieve  relative  to 
the  exhaustive  strategy  commonly  used  in  industry  is  between  11.1%  and  12%;  the  exact 
upper  bound  cannot  be  mentioned  without  revealing  the  true  yield  of  Bohn's  wafers. 
The  fixed  sample  size  strategy  and  the  optimal  strategy  achieve  a  2.2%  and  2.5%  average 
profit  increase  over  the  five  data  sets,  respectively. 

These  two  strategies  are  also  tested  on  the  actual  data  in  a  simulation  study.  By 
randomly  shuffling  the  wafers  in  a  lot,  100  lots  of  wafers  are  generated  from  each  lot  in 
the  five  data  sets.  If  the  yield  model  underestimates  the  average  number  of  discarded 
wafers  per  lot  in  the  simulation  study,  then  the  testing  facility  will  be  underutilized  and  a 
suboptimal  strategy  is  obtained.  If  the  yield  model  overestimates  the  average  number  of 
discarded  wafers,  then  the  testing  facility  will  be  overutilized,  and  an  infeasible  solution 
can  result.  Under  the  fixed  sample  size  strategy,  the  model  accurately  predicts  the  average 
number of discarded wafers per lot, and an average profit increase of 1.2% is achieved.
Under the optimal strategy, the model underestimates the average number of discarded
wafers  by  an  average  of  2.5%  over  the  five  data  sets,  which  results  in  an  average  profit 
decrease  of  0.7%. 
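
The shuffling step of this simulation study can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the defect counts are invented (they are not Bohn's data), and the helper names `shuffled_replicates` and `wafers_tested`, the seed, and the values of `n` and `B` are hypothetical. The rule applied is the fixed sample size rule: test the first `n` wafers, then test the remaining wafers only if the defects found so far total at most `B`.

```python
import random

def shuffled_replicates(lot, copies=100, seed=0):
    """Return `copies` random reorderings of one lot's per-wafer defect counts."""
    rng = random.Random(seed)
    replicates = []
    for _ in range(copies):
        perm = list(lot)
        rng.shuffle(perm)
        replicates.append(perm)
    return replicates

def wafers_tested(defects, n, B):
    """Fixed sample size rule: test the first n wafers, then test the rest
    only if the defects found on those n wafers total at most B."""
    if len(defects) <= n or sum(defects[:n]) <= B:
        return len(defects)
    return n

# Hypothetical defect counts for one six-wafer lot (not Bohn's data).
lot = [20, 35, 90, 95, 30, 85]
replicates = shuffled_replicates(lot)
average_tested = sum(wafers_tested(rep, n=2, B=100) for rep in replicates) / len(replicates)
print(average_tested)  # between 2 and 6; the exact value depends on the shuffles
```

Comparing such an empirical average with the value the yield model predicts is what reveals whether the testing facility would be under- or overutilized.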

In  summary,  the  fixed  sample  size  strategy  may  be  preferable  to  the  optimal  strategy, 
since  it  is  much  easier  to  derive  and  to  implement,  it  performs  nearly  as  well  on  the 
analytical  model,  and  appears  to  be  more  robust  when  faced  with  the  actual  data  sets. 
We  believe  that  the  discrepancy  between  the  theoretical  results  and  the  simulation  results 
is  due  primarily  to  the  assumption  that  all  lots  in  the  same  data  set  have  the  same  shape 
parameter $\alpha$. Hence, a relaxation of this assumption is probably required to obtain a
more  accurate  estimate  of  the  average  number  of  discarded  chips  per  lot,  which  should 
lead  to  a  more  effective  and  reliable  strategy.  The  profit  increases  reported  here  are 
relatively small and, in particular, are significantly smaller than the increases achieved by
screening  at  the  chip  level  in  Longtin  et  al.  However,  readers  should  keep  in  mind  that 
a  1%  increase  in  revenue  minus  variable  cost  can  represent  millions  of  dollars  annually. 
Also,  since  the  fixed  cost  component  is  so  large  in  this  industry,  a  1%  improvement  here 
would  translate  into  a  much  larger  percentage  improvement  in  a  company's  reported 
profits. 

The remainder of the paper is organized as follows. In Section 1, the yield model is
described  in  detail,  and  the  modeling  assumptions  are  compared  with  the  conclusions  of 
Bohn's  empirical  study.  The  stochastic  optimization  problem  is  formulated  in  Section  2. 
The  optimal  fixed  sample  size  screening  strategy  is  found  in  Section  3,  and  the  optimal 
sequential  screening  strategy  is  derived  in  Section  4.  Numerical  results  are  reported  in 
Section  5.  Concluding  remarks  on  this  paper  and  Longtin  et  al.  can  be  found  in  Section 
7  of  the  latter  paper. 

1.  The  Yield  Model  and  Industrial  Data 

Our yield model assumes that the number of defective chips on each wafer in a given lot
is an independent gamma random variable with shape parameter $\alpha$ and scale parameter
$\beta$. An empirical Bayes approach is used, where the shape parameter $\alpha$ is the same
for all lots but the scale parameter $\beta$ varies from lot to lot: for each lot, the value of
the parameter $\beta$ is chosen independently from a gamma prior distribution with known
parameters $a$ and $b$. The two gamma distributions form a conjugate pair: if the parameter
$\beta$ has a gamma $(a, b)$ distribution prior to testing a wafer, and if $x$ chips on the wafer
are found to be defective, then $\beta$ has a gamma $(a + \alpha, b + x)$ posterior distribution.
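
The conjugate update can be illustrated with a small Python sketch. It assumes the parameterization implied by the posterior above, in which $\beta$ enters the wafer-level density as a rate, so that the expected defect count under a gamma $(a, b)$ belief is $\alpha b / (a - 1)$. The class name `GammaPrior` and all numerical values are hypothetical and are not estimates from Bohn's data.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GammaPrior:
    """Gamma(a, b) belief about a lot's parameter beta."""
    a: float
    b: float

    def update(self, x: float, alpha: float) -> "GammaPrior":
        """Posterior after observing x defective chips on one wafer."""
        return GammaPrior(self.a + alpha, self.b + x)

    def expected_defects(self, alpha: float) -> float:
        """E[x] = alpha * E[1/beta] = alpha * b / (a - 1); finite only if a > 1."""
        if self.a <= 1:
            raise ValueError("the mean is finite only when a > 1")
        return alpha * self.b / (self.a - 1)

# Hypothetical parameter values, chosen only for illustration.
alpha = 2.0
belief = GammaPrior(a=3.0, b=40.0)
print(belief.expected_defects(alpha))   # prior mean: 2 * 40 / 2 = 40.0
for x in (70.0, 90.0):                  # two wafers worse than expected
    belief = belief.update(x, alpha)
print(belief)                           # GammaPrior(a=7.0, b=200.0)
print(belief.expected_defects(alpha))   # 2 * 200 / 6, higher than the prior mean
```

After two poor wafers the expected defect count for the next wafer rises, which is the information a sequential screening policy uses when deciding whether to discard the rest of the lot.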


Figure 1. Figure 1 of Bohn: die yields of 11 batches from factory C1, plotted by batch number (successive batches).

In Section 5, maximum likelihood estimation is employed to estimate the parameters
$\alpha$, $a$ and $b$ from the industrial yield data collected by Bohn. He analyzed 11
different sets of data from five factories, and we analyze five of these data sets (sets
C1, C1.5, C2, C2.5, C3), which are all from the same factory. Each data set consisted of
about ten lots, or batches, of wafers of the same product that were completed during the
same week. To disguise the actual yield, Bohn presented the raw data as the number
of good chips, or die, on each wafer of each lot. In Figure 1, we reproduce Figure 1 of
Bohn, which displays a summary of data set C1. Each column in Figure 1 corresponds
to a lot of wafers and each point represents the number of good chips on a particular
wafer; hence, Figure 1 essentially contains 11 yield histograms, one for each lot. Bohn
came to the following three conclusions concerning his 11 data sets: (i) the mean yield
of each lot varies considerably from lot to lot (e.g., compare the last two lots); (ii) the
within lot variability (i.e., the vertical spread of points in each column) is high; and (iii)
there is a high variation between lots of within lot variability (e.g., compare the second
and seventh lots).

The gamma-gamma model certainly captures the lot-to-lot variability in mean yield.
However, plenty of other conjugate pairs would also capture this effect. In fact, before
considering the gamma-gamma pair, we performed our entire analysis using the beta-binomial
pair and the gamma-Poisson pair; the beta-binomial model, in particular, has
intuitive appeal, since the number of bad chips per wafer is explicitly modeled as an
integer between zero and the number of chips on a wafer. However, the binomial and
Poisson assumptions significantly underestimate the within lot variability of chip yield.
More specifically, we calculated the mean and variance of the number of good chips on
each wafer of a given lot, and determined the variance-to-mean ratio for each lot in
the five data sets. The average ratio over all 53 lots was 7.6 and the range was 1.8 to
30.3. In contrast, the corresponding variance-to-mean ratio under the binomial (Poisson,
respectively) assumption is less than (equal to, respectively) one. Consequently, when
the controls derived from these two yield models were tested on the actual data, too
many wafers were discarded at the testing facility, which led to a significant reduction
in overall profit. Although the gamma-gamma conjugate pair captures the substantial
level of within lot variability, it is unable to capture the high variation between lots of
within lot variability. Perhaps a gamma-gamma model in which both the shape and
scale parameters are unknown would capture this effect; computational considerations
prevented us from pursuing this avenue. In summary, the gamma-gamma model captures
the effects in conclusions (i) and (ii), but does not capture the effect in conclusion (iii).
To further investigate the validity of our model, we test the derived policies on the actual
data sets in Section 5.
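
The variance-to-mean diagnostic used above is easy to reproduce. The following Python sketch shows the calculation for a single lot. The wafer counts and the number of chips per wafer `M` are invented for illustration (they are not Bohn's data), and the function name `variance_to_mean` is hypothetical.

```python
from statistics import mean, variance

def variance_to_mean(counts):
    """Sample variance divided by sample mean of one lot's per-wafer counts."""
    return variance(counts) / mean(counts)

# Hypothetical good-chip counts for the wafers of one lot (not Bohn's data).
lot = [120, 95, 140, 60, 110, 150, 80, 130]
M = 200  # hypothetical number of chips per wafer

ratio = variance_to_mean(lot)
p = mean(lot) / M          # binomial success probability matched to the lot mean
binomial_ratio = 1 - p     # binomial: M p (1 - p) / (M p) = 1 - p, below one
poisson_ratio = 1.0        # Poisson: variance equals mean

print(round(ratio, 2), round(binomial_ratio, 2), poisson_ratio)
```

A ratio far above one, as in this made-up lot, is the kind of overdispersion that neither the binomial nor the Poisson assumption can reproduce.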

Finally,  readers  may  note  that  the  number  of  wafers  in  a  lot  is  not  constant  in  Figure 
1.  This  is  due  to  the  scrapping  of  entire  wafers  during  fabrication.  Hence,  we  also  assume 
that each wafer in a lot has a certain probability of being scrapped during fabrication,
so  that  the  size  of  a  lot  exiting  the  fab  is  a  binomial  random  variable. 

2.  Problem  Formulation 

In this section, we mathematically formulate the optimization problem described in
the Introduction, and pictured in Figure 2. Each lot entering the fab contains $L$ wafers
and each wafer consists of $M$ chips. Each wafer in a lot is scrapped during fabrication
with probability $q$, and hence the size $l$ of a lot exiting the fab is a binomial random
variable with parameters $L$ and $1 - q$. Since the exact number of wafers in a lot is known
when the lot arrives at the testing facility, it is natural to use this information to develop
an optimal screening policy. However, this would require us to derive a different optimal
screening policy for every possible value of $l$, which makes the optimal solution much
more difficult to compute and harder to implement in practice. Instead, we do not allow
our screening policy to differ from lot to lot, except for the obvious constraint that no
more than $l$ wafers can be tested from a lot with $l$ wafers. Since most fabs typically scrap
about 5-10% of their wafers, this assumption should not lead to significant degradations
in performance.


Figure 2. The semiconductor manufacturing facility. Flow labels: $\lambda$ lots/week into the fab; $qL\lambda$ wafers/week scrapped; $(1-q)L\lambda$ wafers/week to TESTING ($\mu_T$ wafers/week); outputs: untested wafers, good wafers, bad wafers.
For a typical lot exiting the fab, let $x_n$ be the number of defective chips on the $n$th
wafer, for $n = 1, \ldots, l$, and let $s_n = \sum_{k=1}^{n} x_k$ be the total number of defective chips on
the first $n$ wafers. Let the decision variable $u_n = 1$ if the $n$th wafer is to be tested and
$u_n = 0$ if the $n$th wafer is to be discarded. A screening policy is defined by the vector
$u = (u_1, u_2, \ldots, u_L)$, where $u_n = 0$ for $n > l$. For $n = 1, \ldots, l$, the decision $u_n$ depends on
$(x_1, \ldots, x_{n-1})$ only through the sufficient statistic $s_{n-1}$, where $s_0 = 0$. The profit generated
by wafer $n$ is

$$g(x_n, u_n) = \begin{cases} 0 & \text{if } u_n = 0, \\ r(M - x_n) - c_T & \text{if } u_n = 1, \end{cases} \qquad (1)$$

where $r$ is the revenue received from a good chip and $c_T$ is the variable testing cost per
wafer. For a given policy $u$, the expected profit from testing one lot of wafers is

$$V(u) = E\Big[\sum_{n=1}^{L} g(x_n, u_n)\Big], \qquad (2)$$

and the expected number of wafers tested per lot is

$$N(u) = E\Big[\sum_{n=1}^{L} u_n\Big], \qquad (3)$$

where both expectations are over the random variables $l$, which is embedded in the
definition of $u_n$, and $(x_1, \ldots, x_l)$.
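
As a concrete reading of (1), the Python sketch below evaluates the realized profit and the number of wafers tested for one lot under two decision vectors. The function names `wafer_profit` and `lot_profit` and all numbers are hypothetical. The defect counts of discarded wafers are listed only so that the two decision vectors can be compared; in practice they would never be observed.

```python
def wafer_profit(x: float, u: int, r: float, M: int, c_T: float) -> float:
    """g(x_n, u_n) from (1): zero if discarded, r (M - x_n) - c_T if tested."""
    return r * (M - x) - c_T if u == 1 else 0.0

def lot_profit(xs, us, r, M, c_T):
    """Realized profit and number of wafers tested for one lot."""
    profit = sum(wafer_profit(x, u, r, M, c_T) for x, u in zip(xs, us))
    return profit, sum(us)

# Hypothetical numbers, for illustration only.
r, M, c_T = 1.0, 100, 30.0
xs = [20, 35, 90, 95]           # defective chips on each wafer of a 4-wafer lot
test_all = [1, 1, 1, 1]
stop_after_two = [1, 1, 0, 0]   # discard the last two wafers untested

print(lot_profit(xs, test_all, r, M, c_T))        # (40.0, 4)
print(lot_profit(xs, stop_after_two, r, M, c_T))  # (85.0, 2)
```

In this made-up lot, discarding the two worst wafers raises the lot's profit and also frees half of the testing time, which is the capacity that a higher start rate can then use.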

The problem of finding a screening policy $u$ that maximizes (2) is an optimal stopping
problem. Our problem of simultaneous quality and quantity control involves one more
decision variable and two extra constraints. The decision variable is the lot start rate
$\lambda$, which is the number of lots introduced into the fab per week. Let the fab's effective
capacity be $\mu_F$ lots per week and let the testing facility's effective capacity be $\mu_T$ wafers
per week. We assume that if the rate of work entering either of these facilities exceeds
its effective capacity, then unacceptably high lead times and work in process inventory
levels will be incurred. Hence, the two constraints are

$$\lambda \le \mu_F \qquad (4)$$

and

$$\lambda N(u) \le \mu_T. \qquad (5)$$
Our optimization problem is to choose the start rate $\lambda$ and a screening policy $u$ to
maximize

$$\lambda (V(u) - c_F) \qquad (6)$$

subject to constraints (4) and (5), where $c_F$ is the variable fabrication cost per lot.

We conclude this section with some assumptions on the problem parameters. Before
any testing is performed, the a priori expected number of defective chips on a wafer is

$$E[x_n] = \frac{\alpha b}{a - 1}. \tag{7}$$

To ensure that this quantity is positive, we need to assume that $a > 1$, which holds
for the parameter estimates obtained from Bohn's data in Section 5. If we denote the
exhaustive testing policy by $u^E$, then

$$V(u^E) = (1 - q)L\Big[r\Big(M - \frac{\alpha b}{a - 1}\Big) - c_T\Big] \tag{8}$$

and

$$N(u^E) = (1 - q)L. \tag{9}$$

We assume

$$\mu_F > \frac{\mu_T}{(1 - q)L}, \tag{10}$$

so that the testing facility is the bottleneck under the exhaustive testing policy, and

$$r\Big(M - \frac{\alpha b}{a - 1}\Big) - c_T > \frac{c_F}{(1 - q)L}, \tag{11}$$

so that exhaustive testing is profitable.
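As a quick numerical illustration of (7)-(11), the following minimal Python sketch evaluates the exhaustive testing policy and checks the two assumptions. The function names and all parameter values are hypothetical choices made for this example; they are not the confidential estimates used in the paper.

```python
def prior_mean_defects(alpha, a, b):
    """Equation (7): a priori expected number of defective chips per wafer."""
    if a <= 1:
        raise ValueError("equation (7) requires a > 1")
    return alpha * b / (a - 1)


def exhaustive_policy(L, q, r, M, c_T, alpha, a, b):
    """Equations (8)-(9): return (V(u^E), N(u^E)) for the exhaustive testing policy."""
    wafers_tested = (1 - q) * L
    profit_per_wafer = r * (M - prior_mean_defects(alpha, a, b)) - c_T
    return wafers_tested * profit_per_wafer, wafers_tested


def assumptions_hold(L, q, r, M, c_T, c_F, mu_F, mu_T, alpha, a, b):
    """Check (10) (testing is the bottleneck) and (11) (exhaustive testing is profitable)."""
    V_E, N_E = exhaustive_policy(L, q, r, M, c_T, alpha, a, b)
    # (11) multiplied through by (1 - q) L is the same as V(u^E) > c_F.
    return mu_F > mu_T / N_E, V_E > c_F


# Illustrative values only (not taken from the paper).
params = dict(L=20, q=0.05, r=1.0, M=100, c_T=3.0, alpha=2.0, a=5.0, b=80.0)
V_E, N_E = exhaustive_policy(**params)
checks = assumptions_hold(c_F=500.0, mu_F=10.0, mu_T=171.0, **params)
```

With these made-up numbers the testing facility saturates at 9 lots per week, which is 90% of the assumed fab capacity of 10 lots per week.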

3.  The  Optimal  Fixed  Sample  Size  Screening  Policy 

Since the optimal solution $(\lambda, u)$ to problem (4)-(6) is difficult to obtain, we restrict
ourselves in this section to a fixed sample size screening policy, which is denoted by $u^{n,B}$.
Under this strategy, the number of wafers tested from a lot is $\min\{n, l\}$. If the total
number of defective chips found in these wafers is less than or equal to $B$, then the
remaining wafers in the lot are tested; otherwise, the remaining wafers are discarded.

Standard calculations show that

$$f(s_{n-1}) = \frac{\Gamma(a + (n-1)\alpha)}{\Gamma(a)\,\Gamma((n-1)\alpha)}\,
\frac{b^{a}\, s_{n-1}^{(n-1)\alpha - 1}}{(b + s_{n-1})^{a + (n-1)\alpha}}, \qquad s_{n-1} \ge 0, \tag{12}$$

is the probability density function for the number of bad chips on the first $n - 1$ wafers
tested, and

$$E[x_n \mid s_{n-1}] = \frac{\alpha\,(b + s_{n-1})}{a + (n-1)\alpha - 1} \tag{13}$$

is the expected number of defective chips on the $n$th wafer, given that $s_{n-1}$ defective
chips are found on the first $n - 1$ wafers. Also, the probability that a lot entering the
testing facility has $l$ wafers is given by

$$H(l) = \binom{L}{l}\, q^{L-l}\,(1 - q)^{l}. \tag{14}$$
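The three quantities in (12)-(14) are straightforward to evaluate numerically. The sketch below is one possible Python rendering; the function names are hypothetical, the argument `k` stands for the number of wafers already tested ($n - 1$ in the text), and the density is computed on the log scale only as a precaution against overflow in the gamma functions.

```python
import math


def density_cumulative_defects(s, k, alpha, a, b):
    """Equation (12): pdf of the total number of bad chips s on the first k wafers."""
    if s <= 0 or k < 1:
        return 0.0
    log_f = (math.lgamma(a + k * alpha) - math.lgamma(a) - math.lgamma(k * alpha)
             + a * math.log(b) + (k * alpha - 1) * math.log(s)
             - (a + k * alpha) * math.log(b + s))
    return math.exp(log_f)


def expected_defects_next_wafer(s, k, alpha, a, b):
    """Equation (13): expected bad chips on wafer k + 1 given s bad chips on k wafers."""
    return alpha * (b + s) / (a + k * alpha - 1)


def lot_size_pmf(l, L, q):
    """Equation (14): probability that a lot entering testing has l wafers."""
    return math.comb(L, l) * q ** (L - l) * (1 - q) ** l
```

Setting `k = 0` and `s = 0` in `expected_defects_next_wafer` recovers the prior mean in (7), which is a convenient consistency check on the parameterization.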


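Hence, the expected profit per lot of wafers is

$$\begin{aligned}
V(u^{n,B}) ={}& \sum_{l=0}^{n} H(l)\,\{ r\,l\,(M - E[x_n]) - l\,c_T \} \\
&+ \sum_{l=n+1}^{L} H(l)\, r\,\Big\{ n\,(M - E[x_n]) + (l - n)\int_{0}^{B} (M - E[x_{n+1} \mid s_n])\, f(s_n)\, ds_n \Big\} \\
&- \sum_{l=n+1}^{L} H(l)\, c_T\,\Big\{ n + (l - n)\int_{0}^{B} f(s_n)\, ds_n \Big\},
\end{aligned} \tag{15}$$

and the expected number of wafers tested per lot is

$$N(u^{n,B}) = \sum_{l=0}^{n} H(l)\, l + \sum_{l=n+1}^{L} H(l)\,\Big\{ n + (l - n)\int_{0}^{B} f(s_n)\, ds_n \Big\}. \tag{16}$$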

Thus, problem (4)-(6) reduces to

$$\max_{\lambda,\, n,\, B}\ \lambda\,(V(u^{n,B}) - c_F) \tag{17}$$

$$\text{subject to}\quad \lambda \le \mu_F, \tag{18}$$

$$\lambda\, N(u^{n,B}) \le \mu_T, \tag{19}$$

which is equivalent to

$$\max_{n,\, B}\ \min\{\mu_F,\ \mu_T / N(u^{n,B})\}\,(V(u^{n,B}) - c_F). \tag{20}$$

Since a closed form solution to (20) appears to be unattainable, we exhaustively
enumerate over the integer values $\{(n, B) : 0 \le n \le L,\ 0 \le B \le nM\}$ to find the optimal
solution. The calculations are considerably streamlined by observing that


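$$\begin{aligned}
V(u^{n,B}) ={}& V(u^{n,B-1}) + \sum_{l=n+1}^{L} H(l)\, r\,(l - n)\int_{B-1}^{B} (M - E[x_{n+1} \mid s_n])\, f(s_n)\, ds_n \\
&- \sum_{l=n+1}^{L} H(l)\, c_T\,(l - n)\int_{B-1}^{B} f(s_n)\, ds_n
\end{aligned} \tag{21}$$

and

$$N(u^{n,B}) = N(u^{n,B-1}) + \sum_{l=n+1}^{L} H(l)\,(l - n)\int_{B-1}^{B} f(s_n)\, ds_n. \tag{22}$$

Hence, for each $B$, only $\int_{B-1}^{B} E[x_{n+1} \mid s_n]\, f(s_n)\, ds_n$ and $\int_{B-1}^{B} f(s_n)\, ds_n$ have to be calculated.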
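The enumeration over $(n, B)$ with the incremental updates (21)-(22) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the function names are hypothetical, the unit-interval integrals are approximated by a simple midpoint rule with `steps` points, the case $n = 0$ is treated as the exhaustive policy, and the parameter values in the usage lines are made up.

```python
import math


def lot_size_pmf(l, L, q):
    """Equation (14): probability that a lot reaches testing with l wafers."""
    return math.comb(L, l) * q ** (L - l) * (1 - q) ** l


def density_sum(s, n, alpha, a, b):
    """Equation (12), written for the total number of bad chips s on n tested wafers."""
    log_f = (math.lgamma(a + n * alpha) - math.lgamma(a) - math.lgamma(n * alpha)
             + a * math.log(b) + (n * alpha - 1) * math.log(s)
             - (a + n * alpha) * math.log(b + s))
    return math.exp(log_f)


def best_fixed_sample_policy(L, M, q, r, c_T, c_F, mu_F, mu_T, alpha, a, b, steps=20):
    """Enumerate (n, B) as in (20); return ((objective, n, B), exhaustive objective)."""
    H = [lot_size_pmf(l, L, q) for l in range(L + 1)]
    wafer_profit = r * (M - alpha * b / (a - 1)) - c_T  # prior expected profit per tested wafer

    def objective(V, N):
        return min(mu_F, mu_T / N) * (V - c_F)

    # n = 0: no sample is taken, so every wafer is tested (the exhaustive policy).
    N0 = sum(H[l] * l for l in range(L + 1))
    baseline = objective(N0 * wafer_profit, N0)
    best = (baseline, 0, 0)
    h = 1.0 / steps
    for n in range(1, L + 1):
        # B = 0: test min(n, l) wafers and discard the rest; the integrals in (15)-(16) vanish.
        N = sum(H[l] * min(n, l) for l in range(L + 1))
        V = N * wafer_profit
        tail = sum(H[l] * (l - n) for l in range(n + 1, L + 1))
        best = max(best, (objective(V, N), n, 0))
        if tail == 0.0:
            continue
        for B in range(1, n * M + 1):
            p = m = 0.0  # midpoint-rule estimates of the two integrals over [B - 1, B]
            for j in range(steps):
                s = B - 1 + (j + 0.5) * h
                w = density_sum(s, n, alpha, a, b) * h
                p += w
                m += w * alpha * (b + s) / (a + n * alpha - 1)
            N += tail * p                                # update (22)
            V += tail * (r * (M * p - m) - c_T * p)      # update (21)
            best = max(best, (objective(V, N), n, B))
    return best, baseline


# Illustrative values only (not taken from the paper).
best, baseline = best_fixed_sample_policy(L=10, M=20, q=0.05, r=1.0, c_T=2.0, c_F=50.0,
                                          mu_F=10.5, mu_T=90.0, alpha=2.0, a=5.0, b=12.0)
```

Because the exhaustive policy is included as the $n = 0$ candidate, the returned objective can never fall below the exhaustive baseline; how much it exceeds it depends entirely on the assumed parameters.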

4.  The  Optimal  Solution 

In this section, a computational procedure is developed to solve problem (4)-(6), which
is essentially an optimal stopping problem embedded within a mathematical program.
First we reformulate the problem into the equivalent two-step maximization problem

$$\max_{0 \le \lambda \le \mu_F}\ \lambda\,(V_\lambda - c_F), \tag{23}$$

where

$$V_\lambda = \max_{u}\ V(u) \tag{24}$$

$$\text{subject to}\quad N(u) \le \frac{\mu_T}{\lambda}. \tag{25}$$

Proposition 1. $(\lambda^*, u^*)$ is an optimal solution to problem (4)-(6) if and
only if $u^*$ is an optimal solution to problem (24)-(25) with $\lambda = \lambda^*$ in (25), and $\lambda^*$ is
the optimal solution to (23). The optimal objective function value is the same for both
problems.

Proof. If $(\lambda^*, u^*)$ is an optimal solution to problem (4)-(6), then $u^*$ satisfies (25)
with $\lambda = \lambda^*$ and, for any screening policy $u$ satisfying this condition, we have

$$\lambda^*\,(V(u^*) - c_F) \ge \lambda^*\,(V(u) - c_F) \tag{26}$$

or

$$V(u^*) \ge V(u). \tag{27}$$

Hence, $u^*$ is an optimal solution to problem (24)-(25) with $\lambda = \lambda^*$ in (25), and
$V(u^*) = V_{\lambda^*}$.

Observe that

$$\lambda^*\,(V(u^*) - c_F) \ge \lambda\,(V(u) - c_F) \tag{28}$$

for all $\lambda$ and $u$ satisfying (4) and (5). Fixing $\lambda$ and maximizing over $u$ subject to (25)
yields

$$\lambda^*\,(V_{\lambda^*} - c_F) \ge \lambda\,(V_\lambda - c_F) \tag{29}$$

for all $\lambda$ such that $0 \le \lambda \le \mu_F$. Therefore, $\lambda^*$ is an optimal solution to (23) and the
optimal objective function value is the same for both problems.

Conversely, if $u^*$ is an optimal solution to (24)-(25) and $\lambda^*$ is an optimal solution to
(23), then they jointly satisfy constraints (4) and (5). For any other feasible solution
$(\lambda, u)$ to (4)-(6), we have

$$\lambda\,(V(u) - c_F) \le \lambda\,(V_\lambda - c_F) \le \lambda^*\,(V_{\lambda^*} - c_F) = \lambda^*\,(V(u^*) - c_F), \tag{30}$$

which implies that $(\lambda^*, u^*)$ is an optimal solution to (4)-(6), and the optimal objective
function value is the same for both problems. ∎

Let $u^\circ$ be the screening policy that maximizes the function $V(u)$ defined in (2).
Maximizing $V(u)$ is an optimal stopping problem and will be discussed later in this
section.

Proposition 2. If $N(u^\circ) \le \mu_T/\mu_F$, then the optimal solution to (23)-(25) is $u^* = u^\circ$
and $\lambda^* = \mu_F$.

Proof. Since the screening strategy $u^\circ$ maximizes $V(u)$ with no side constraints, it
also maximizes (24)-(25) for all $\lambda \in [0, \mu_T/N(u^\circ)]$. Since $N(u^\circ) \le \mu_T/\mu_F$, it follows
that $u^\circ$ optimizes (24)-(25) for all $\lambda \in [0, \mu_F]$. By (11), the exhaustive testing policy is
profitable, and hence $V(u^\circ) - c_F \ge V(u^E) - c_F > 0$, so setting $\lambda = \mu_F$ optimizes (23). ∎

Thus, when $N(u^\circ) \le \mu_T/\mu_F$, the probing facility is not used to its full effective
capacity, and the solution to (4)-(6) is obtained by solving a single optimal stopping
problem. We now consider the more interesting situation where $N(u^\circ) > \mu_T/\mu_F$. Since
$u^\circ$ optimizes (24)-(25) for all $\lambda \in [0, \mu_T/N(u^\circ)]$, (23) can be replaced by

$$\max_{\mu_T/N(u^\circ)\, \le\, \lambda\, \le\, \mu_F}\ \lambda\,(V_\lambda - c_F). \tag{31}$$

If problem (24)-(25) can be solved efficiently for a given $\lambda$, then a one-dimensional search
over $\lambda \in [\mu_T/N(u^\circ), \mu_F]$ for the largest value of $\lambda(V_\lambda - c_F)$ will yield an optimal solution
to our original problem. Since (24)-(25) is a constrained optimal stopping problem, we
solve this problem by employing a Lagrangian approach. Let $\gamma$ be the Lagrange multiplier
for constraint (25) and define


$$g^{\gamma}(x_n, u_n) = \begin{cases} 0 & \text{if } u_n = 0, \\ r(M - x_n) - c_T - \gamma & \text{if } u_n = 1. \end{cases} \tag{32}$$

Notice that $\gamma$ plays the role of an additional testing cost, so that the total testing cost
per wafer is $c_T + \gamma$. Define the Lagrangian function

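$$\begin{aligned}
V^{\gamma}(u) &= V(u) - \gamma N(u) \\
&= E\Big[\sum_{n=1}^{L} g(x_n, u_n)\Big] - \gamma N(u) \\
&= E\Big[\sum_{n=1}^{L} g^{\gamma}(x_n, u_n)\Big],
\end{aligned} \tag{33}$$

and consider the Lagrangian problem

$$\max_{u}\ V^{\gamma}(u). \tag{34}$$

Proposition 3. If the screening policy $u^*(\gamma)$ solves the Lagrangian problem for some
$\gamma \ge 0$, and

$$N(u^*(\gamma)) = \frac{\mu_T}{\lambda}, \tag{35}$$

then $u^*(\gamma)$ is the optimal solution to problem (24)-(25).

Proof. For any screening strategy $u$ satisfying (25),

$$\begin{aligned}
V(u) &\le V(u) - \gamma N(u) + \gamma\,\frac{\mu_T}{\lambda} \\
&\le V(u^*(\gamma)) - \gamma N(u^*(\gamma)) + \gamma\,\frac{\mu_T}{\lambda} \\
&= V(u^*(\gamma)).
\end{aligned} \tag{36}$$

∎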


Since $\gamma$ enters the Lagrangian problem as an additional testing cost, it is not hard
to show that the optimal objective function value in (34) is a continuous nonincreasing
function of $\gamma$. The proof of the following proposition relies on this fact and the conjecture
that the optimal expected number of wafers tested per lot $N(u^*(\gamma))$ is also a continuous,
nonincreasing function of $\gamma$. Although this conjecture has been borne out in our numerical
study and seems as intuitively obvious as the continuity and monotonicity of the optimal
objective function value, the awkward expression for $N(u^*(\gamma))$ in (52) has prevented us
from providing a rigorous proof.

Proposition 4. If $N(u^\circ) > \mu_T/\mu_F$, then there exists a $\bar\gamma \in (0, rM]$ such that $u^*(\bar\gamma)$ is
an optimal solution to (24)-(25) with start rate $\lambda = \mu_F$.

Proof. As $\gamma$ increases from 0 to $rM$, $V^{\gamma}(u^*(\gamma))$ decreases from $V(u^\circ)$ to 0, since
the optimal solution to (34) is to discard all wafers when $\gamma = rM$. Similarly, if our
conjecture is correct, $N(u^*(\gamma))$ decreases from $N(u^\circ)$ to 0 as $\gamma$ increases from 0 to $rM$.
Since $0 < \mu_T/\mu_F < N(u^\circ)$, there must be a $\bar\gamma \in (0, rM]$ for which $N(u^*(\bar\gamma)) = \mu_T/\mu_F$.
By Proposition 3, $u^*(\bar\gamma)$ is an optimal solution to (24)-(25) with start rate $\lambda = \mu_F$. ∎

Propositions 3 and 4 can be combined to develop a search procedure
for solving problem (4)-(6). For fixed $\lambda \in [\mu_T/N(u^\circ), \mu_F]$, we solve (34) and search for
that $\gamma$ for which $N(u^*(\gamma)) = \mu_T/\lambda$. Proposition 4 guarantees the existence of such a
$\gamma$ in the interval $[0, \bar\gamma]$. Then, we evaluate the objective function in (31) and search for
a $\lambda^*$ that has the largest objective function value. However, the search over $\lambda$ can be
accomplished simultaneously as we search over $\gamma$. For each $\gamma \in [0, \bar\gamma]$, we solve (34) and
let $\lambda = \mu_T/N(u^*(\gamma))$. By Proposition 3, $u^*(\gamma)$ is the optimal solution to (24)-(25) with
this value of $\lambda$, and the objective function in (31) can be evaluated. Since for every
$\lambda \in [\mu_T/N(u^\circ), \mu_F]$ there exists a $\gamma$ in the interval $[0, \bar\gamma]$ such that $N(u^*(\gamma)) = \mu_T/\lambda$,
every $\lambda \in [\mu_T/((1 - q)L), \mu_F]$ is searched as all $\gamma \in [0, \bar\gamma]$ are searched. Thus, one search
over $\gamma \in [0, \bar\gamma]$ is sufficient to find the optimal start rate $\lambda^*$ to (31) and the optimal
screening policy $u^*$ to (24)-(25). Readers can find an outline of this algorithm at the
end of this section.

We  now  focus  on  solving  the  Lagrangian  problem  (34).  Let 

1 
be  the  probability  that  a  lot  has  more  than  /  wafers,  and  let 

$$f(x_n \mid s_{n-1}) = \frac{\Gamma(a + n\alpha)}{\Gamma(\alpha)\,\Gamma(a + (n-1)\alpha)}\,
\frac{(b + s_{n-1})^{a + (n-1)\alpha}\, x_n^{\alpha - 1}}{(b + s_{n-1} + x_n)^{a + n\alpha}}, \qquad x_n \ge 0, \tag{38}$$

denote the posterior probability density for the number of bad chips on the $n$th wafer,
given that $s_{n-1}$ bad chips are found on the first $n - 1$ wafers. If $V_n^{\gamma}(s_n)$ represents the
expected profit obtained from wafers $n + 1, \ldots, L$, given that $s_n$ bad chips were detected
on the first $n$ wafers, then $u^*(\gamma)$ and $V^{\gamma}(u^*(\gamma))$ can be found by solving the dynamic
programming equations

$$V_L^{\gamma}(s_L) = 0, \tag{39}$$

$$V_n^{\gamma}(s_n) = \max\Big\{0,\ G(n)\int_0^{\infty} \big[r(M - x_{n+1}) - c_T - \gamma + V_{n+1}^{\gamma}(s_n + x_{n+1})\big]\, f(x_{n+1} \mid s_n)\, dx_{n+1}\Big\}, \quad n = L-1, \ldots, 1, \tag{40}$$

and

$$V^{\gamma}(u^*(\gamma)) = V_0^{\gamma}(0) = \max\Big\{0,\ G(0)\int_0^{\infty} \big[r(M - x_1) - c_T - \gamma + V_1^{\gamma}(x_1)\big]\, f(x_1)\, dx_1\Big\}. \tag{41}$$

After $n$ wafers have been tested, we can discard the remaining wafers and obtain no profit.
If the lot has more than $n$ wafers, then we can continue testing; if wafer $n + 1$ contains
$x_{n+1}$ bad chips, then the immediate profit is $r(M - x_{n+1}) - c_T - \gamma$ and the expected
future profit is $V_{n+1}^{\gamma}(s_n + x_{n+1})$. These equations also reveal structural properties of the
optimal solution to (34), which are discussed in the two propositions below.

Proposition 5. The optimal policy $u^*(\gamma) = (u_1^*(\gamma), \ldots, u_L^*(\gamma))$ is

$$u_n^* = 1 \quad \text{if } s_{n-1} \le B_{n-1}^*, \tag{42}$$

$$u_n^* = 0 \quad \text{if } s_{n-1} > B_{n-1}^*, \tag{43}$$

where the stopping boundary $B_{n-1}^* \ge -1$, $n = 1, \ldots, L$, and $B_{n-1}^* = -1$ indicates that wafer
$n$ is not tested under any circumstances.

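Proof. We only need to show that $V_n^{\gamma}(s_n)$ is nonincreasing in $s_n$, which is done by
a backward induction on $n$. It is trivially true for $n = L$. Suppose it is true for $n + 1$,
and consider the difference $V_n^{\gamma}(s_n + 1) - V_n^{\gamma}(s_n)$. In order to prove that this quantity is
nonpositive, the following properties of the conditional density $f(x_{n+1} \mid s_n)$ are required.
For $n = 1, \ldots, L - 1$ and $s_n \ge 0$, there exists $\bar{x}_{n+1} \ge 1$ such that

$$f(x_{n+1} - 1 \mid s_n + 1) \ge f(x_{n+1} \mid s_n) \quad \text{for } x_{n+1} \ge \bar{x}_{n+1}, \text{ and} \tag{44}$$

$$f(x_{n+1} - 1 \mid s_n + 1) \le f(x_{n+1} \mid s_n) \quad \text{for } x_{n+1} \le \bar{x}_{n+1}. \tag{45}$$

These inequalities can be verified using (38). By (40), it suffices to consider the difference

$$\begin{aligned}
&\int_0^{\infty} \big[r(M - x_{n+1}) - c_T - \gamma + V_{n+1}^{\gamma}(s_n + 1 + x_{n+1})\big]\, f(x_{n+1} \mid s_n + 1)\, dx_{n+1} \\
&\quad - \int_0^{\infty} \big[r(M - x_{n+1}) - c_T - \gamma + V_{n+1}^{\gamma}(s_n + x_{n+1})\big]\, f(x_{n+1} \mid s_n)\, dx_{n+1} \\
&= -\frac{r\alpha}{a + n\alpha - 1} + \int_0^{\infty} V_{n+1}^{\gamma}(s_n + 1 + x_{n+1})\, f(x_{n+1} \mid s_n + 1)\, dx_{n+1}
 - \int_0^{\infty} V_{n+1}^{\gamma}(s_n + x_{n+1})\, f(x_{n+1} \mid s_n)\, dx_{n+1} \\
&\le \int_0^{\infty} \big[V_{n+1}^{\gamma}(s_n + 1 + x_{n+1}) - V_{n+1}^{\gamma}(s_n + \bar{x}_{n+1})\big]\, f(x_{n+1} \mid s_n + 1)\, dx_{n+1} \\
&\quad - \int_0^{\infty} \big[V_{n+1}^{\gamma}(s_n + x_{n+1}) - V_{n+1}^{\gamma}(s_n + \bar{x}_{n+1})\big]\, f(x_{n+1} \mid s_n)\, dx_{n+1}.
\end{aligned} \tag{46}$$

Changing the integration variable in the first integral from $x_{n+1}$ to $x_{n+1} + 1$ and combining
the two integrals in (46), we get

$$\begin{aligned}
&\int_1^{\infty} \big[V_{n+1}^{\gamma}(s_n + x_{n+1}) - V_{n+1}^{\gamma}(s_n + \bar{x}_{n+1})\big]\,\big[f(x_{n+1} - 1 \mid s_n + 1) - f(x_{n+1} \mid s_n)\big]\, dx_{n+1} \\
&\quad - \int_0^{1} \big[V_{n+1}^{\gamma}(s_n + x_{n+1}) - V_{n+1}^{\gamma}(s_n + \bar{x}_{n+1})\big]\, f(x_{n+1} \mid s_n)\, dx_{n+1}.
\end{aligned} \tag{47}$$

The two terms inside the first integral have opposite signs by (44)-(45) and the induction
hypothesis; hence, the first integral is nonpositive. The second integral is nonnegative
because, by the induction assumption, $V_{n+1}^{\gamma}(s_n + x_{n+1}) \ge V_{n+1}^{\gamma}(s_n + \bar{x}_{n+1})$ for
$0 \le x_{n+1} \le 1 \le \bar{x}_{n+1}$. Therefore, (47) is nonpositive, and the induction is verified. ∎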

The  following  proposition  establishes  monotonicity  of  the  optimal  stopping  boundary, 
and  is  used  to  streamline  the  dynamic  programming  algorithm.  The  proof  is  similar  to 
the  proof  of  Proposition  5,  and  is  omitted. 

Proposition 6. The optimal stopping boundary satisfies

$$B_0^* \le B_1^* \le \cdots \le B_{L-1}^*. \tag{48}$$

The dynamic programming equations (39)-(41) involve $L$ functions of continuous
variables. In the numerical computations, we discretize the continuous variables and
approximate the integrals by finite summations. Two observations are helpful in reducing
the amount of computation. First, the final boundary point can be explicitly derived,
and equals

$$B_{L-1}^* = \frac{(a + (L-1)\alpha - 1)\,(M - (\gamma + c_T)/r)}{\alpha} - b. \tag{49}$$

Also, since $V_n^{\gamma}(s_n)$ is nonincreasing in $s_n$, we calculate $V_n^{\gamma}(s_n)$ starting from $s_n = 0$, and
if $s_n$ is found such that $V_n^{\gamma}(s_n) = 0$, then we set $V_n^{\gamma}(x) = 0$ for all $x \in (s_n, nM]$.
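A discretized version of the backward recursion (39)-(41), including the early-exit shortcut just described, might look like the Python sketch below. Several choices here are assumptions made for illustration rather than details taken from the paper: the state $s_n$ lives on an integer grid, the predictive density (38) is evaluated at midpoints and truncated at $M$ chips, $s_n + x_{n+1}$ is rounded up to the next grid point, and $G(n)$ is taken to be the probability that a lot has more than $n$ wafers under the binomial lot-size model (14). All function names and parameter values are hypothetical.

```python
import math


def predictive_weights(s, n, M, alpha, a, b):
    """Discretized density (38) for wafer n + 1, given s bad chips on the first n wafers.

    Evaluated at midpoints 0.5, 1.5, ..., M - 0.5 and renormalized (a truncation at M chips).
    """
    xs = [k + 0.5 for k in range(M)]
    logs = [(alpha - 1) * math.log(x) - (a + (n + 1) * alpha) * math.log(b + s + x) for x in xs]
    top = max(logs)
    w = [math.exp(v - top) for v in logs]
    total = sum(w)
    return xs, [v / total for v in w]


def make_G(L, q):
    """Assumed form of G(n): probability that a lot has more than n wafers, from (14)."""
    H = [math.comb(L, l) * q ** (L - l) * (1 - q) ** l for l in range(L + 1)]
    return lambda n: sum(H[n + 1:])


def solve_lagrangian(gamma, L, M, r, c_T, alpha, a, b, G):
    """Backward recursion (39)-(41); returns (V_0(0), boundaries).

    boundaries[n] is the largest grid point s at which testing wafer n + 1 has positive
    value, or -1 if wafer n + 1 is never tested.
    """
    V_next = [0.0] * (L * M + 1)                 # (39): V_L = 0
    boundaries = [-1] * L
    for n in range(L - 1, -1, -1):
        V_cur = [0.0] * (L * M + 1)
        for s in range(n * M + 1):
            xs, w = predictive_weights(s, n, M, alpha, a, b)
            cont = 0.0
            for k, (x, p) in enumerate(zip(xs, w)):
                # s + x lies between grid points s + k and s + k + 1; round up.
                cont += p * (r * (M - x) - c_T - gamma + V_next[s + k + 1])
            cont *= G(n)
            if cont <= 0.0:
                break                            # value stays 0 for all larger s
            V_cur[s] = cont
            boundaries[n] = s
        V_next = V_cur
    return V_next[0], boundaries


# Illustrative values only (not taken from the paper).
G = make_G(L=6, q=0.05)
value, boundaries = solve_lagrangian(gamma=0.0, L=6, M=20, r=1.0, c_T=2.0,
                                     alpha=2.0, a=5.0, b=12.0, G=G)
```

The early `break` relies on the monotonicity established in Proposition 5; with a coarse grid that property is only approximate, so a finer grid or a full sweep over $s$ may be preferable when accuracy matters.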

After the optimal solution $u^*(\gamma)$ to the Lagrangian problem is derived, we need to
determine $N(u^*(\gamma))$, which is the expected number of wafers that are tested per lot.
Notice that the optimal boundary point $B_0^* = -1$ or 0. If $B_0^* = -1$, then the screening
policy $u^*(\gamma)$ cannot be optimal for the original problem (4)-(6), by (11). If $B_0^* = 0$, then
the probability that testing continues beyond the $n$th wafer is

$$C_n = \int_{s_1 = 0}^{B_1^*} \cdots \int_{s_n = 0}^{B_n^*} f(s_n - s_{n-1} \mid s_{n-1}) \cdots f(s_1)\, ds_n \cdots ds_1, \qquad n = 1, \ldots, L - 1, \tag{50}$$

and the probability that testing ceases after the $n$th wafer is

$$T_n = C_{n-1} - C_n, \qquad n = 1, \ldots, L - 1, \tag{51}$$

where $C_0 = 1$. Then the expected number of wafers tested per lot is

$$N(u^*(\gamma)) = \sum_{n=1}^{L-1} n\,T_n + L\Big(1 - \sum_{n=1}^{L-1} T_n\Big). \tag{52}$$
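Once the continuation probabilities $C_0, \ldots, C_{L-1}$ are available, (51) and (52) reduce to a few lines of arithmetic. The Python sketch below assumes they are supplied as a list; the function name and the example values are hypothetical. Substituting (51) into (52) and telescoping shows that the result equals $\sum_{n=0}^{L-1} C_n$, which gives a simple check on an implementation.

```python
def expected_wafers_tested(C):
    """Equations (51)-(52).

    C[n] for n = 0, ..., L - 1 is the probability that testing continues beyond
    wafer n, with C[0] = 1.
    """
    L = len(C)
    T = [C[n - 1] - C[n] for n in range(1, L)]          # (51): T[n - 1] holds T_n
    return sum(n * t for n, t in enumerate(T, start=1)) + L * (1 - sum(T))  # (52)


# Made-up continuation probabilities for a lot of L = 4 wafers.
C_example = [1.0, 0.9, 0.8, 0.75]
N_example = expected_wafers_tested(C_example)
```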

We  conclude  this  section  with  an  outline  of  the  algorithm  that  solves  the  original 
optimization  problem  (4)-(6). 

Algorithm: 

Step 1. Let $\gamma = 0$ and set $\gamma^* = 0$.

Step 2. Find the optimal solution $u^*(\gamma)$ to the Lagrangian problem (34) and the
optimal objective function value $V^{\gamma}(u^*(\gamma))$. If the optimal boundary $B_0^* = -1$, then
stop. The optimal start rate is $\lambda_{\gamma^*}$ and the optimal screening policy is $u^*(\gamma^*)$.

Step 3. Calculate $N(u^*(\gamma))$ using (52), and define $\lambda_\gamma = \mu_T / N(u^*(\gamma))$.

Step 4. Compute the objective function value for the original problem,

$$P_\gamma = \lambda_\gamma\,[V^{\gamma}(u^*(\gamma)) + \gamma N(u^*(\gamma)) - c_F]. \tag{53}$$

If $P_\gamma$ is the maximum over all $P_\gamma$'s calculated thus far, then let $\gamma^* = \gamma$. Change $\gamma$ to
$\gamma + \delta$, where $\delta$ is a small step variable, and go to Step 2.

Notice that the algorithm is guaranteed to terminate, since $B_0^* = -1$ when $\gamma = rM$.
If we could prove that $P_\gamma$ is concave in $\gamma$, then a binary search, rather than an exhaustive
search, over $\gamma \in [0, rM]$ could be employed, thereby saving a considerable amount of
computation. For all five data sets considered in the next section, $P_\gamma$ is indeed concave
with respect to $\gamma$. Although we have been able to prove that the optimal value function
$V^{\gamma}(u^*(\gamma))$ is decreasing and concave in $\gamma$, we have been unable to prove the concavity of
$P_\gamma$; our obstacle is again the expression for $N(u^*(\gamma))$ in (52).
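The control flow of Steps 1 to 4 can be sketched in Python as follows. The callable `solve` is a hypothetical stand-in for Steps 2 and 3 (solving (34) and evaluating (52)); `toy_solver` is a made-up closed-form example used only to exercise the loop and has no connection to the yield model. One deliberate deviation is labeled in the code: the start rate is capped at $\mu_F$ so that every candidate respects constraint (4).

```python
def sweep_gamma(solve, mu_F, mu_T, c_F, gamma_max, delta):
    """Exhaustive search over gamma following Steps 1-4.

    solve(gamma) must return (V_gamma, N, B0): the optimal Lagrangian value, the
    expected number of wafers tested per lot, and the first boundary point.
    Returns (P, gamma, start_rate) for the best gamma found, or None.
    """
    gamma, best = 0.0, None                       # Step 1
    while gamma <= gamma_max:
        V_gamma, N, B0 = solve(gamma)             # Step 2 (and N from Step 3)
        if B0 == -1:
            break                                 # no wafer is tested: stop
        # Step 3, with a cap at mu_F added in this sketch to keep (4) satisfied.
        lam = min(mu_F, mu_T / N)
        P = lam * (V_gamma + gamma * N - c_F)     # Step 4, equation (53)
        if best is None or P > best[0]:
            best = (P, gamma, lam)
        gamma += delta
    return best


def toy_solver(gamma):
    """Made-up example: V(N) = 20 N - N^2 on 0 <= N <= 10, so N(gamma) = (20 - gamma) / 2."""
    N = max(0.0, (20.0 - gamma) / 2.0)
    return (20.0 - gamma) * N - N * N, N, (0 if N > 0 else -1)


best = sweep_gamma(toy_solver, mu_F=10.0, mu_T=80.0, c_F=40.0, gamma_max=20.0, delta=0.5)
```

In this toy instance the search settles on the multiplier at which both capacity constraints bind, which mirrors the role of $\bar\gamma$ in Proposition 4.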
5. Numerical Results

In this section, we test the optimal fixed sample size screening strategy and the
optimal sequential screening strategy on five sets of yield data, where each set has about
10 lots and each lot has fewer than 25 wafers. The data sets, denoted by C1, C1.5, C2, C2.5
and C3, were obtained by Bohn from the same factory producing the same product in
five different time periods. For each of the five data sets, maximum likelihood estimation
is used to obtain values of the gamma parameters $\alpha$, $a$ and $b$. More specifically, the
following procedure is followed for each data set. If a data set contains $m$ lots, then the
maximum likelihood estimates $(\hat\alpha_1, \ldots, \hat\alpha_m)$ and $(\hat\beta_1, \ldots, \hat\beta_m)$ are obtained from the number
of defective chips on each wafer in the set. We estimate $\alpha$ by $\hat\alpha$, which is the median of
$(\hat\alpha_1, \ldots, \hat\alpha_m)$, and then recompute the estimates $(\hat\beta_1, \ldots, \hat\beta_m)$ by assuming that the number
of defective chips on each wafer in lot $k$ is a gamma random variable with known shape
parameter $\hat\alpha$ and scale parameter $\beta_k$. Finally, the revised estimates $(\hat\beta_1, \ldots, \hat\beta_m)$ are used
to obtain maximum likelihood parameter estimates $\hat a$ and $\hat b$. These parameter estimates
$\hat\alpha$, $\hat a$ and $\hat b$ are not reported here for reasons of confidentiality. We also performed the
identical estimation procedure, but chose $\hat\alpha$ to be the mean, rather than the median, of
$(\hat\alpha_1, \ldots, \hat\alpha_m)$; the profitability results for this case were quite similar to the results obtained
from the original procedure and are omitted.

As mentioned earlier, we assume that when the probing facility is working at its
effective capacity under the exhaustive testing policy, the fab is working at 90% of its
effective capacity. The wafer scrap rate is 5%, the variable probing cost per wafer is 3% of
the variable fabrication cost per wafer and the revenue from a wafer containing all good
chips is 10 times the wafer's variable production cost. These parameter values are based
on discussions with a variety of semiconductor managers and engineers, and are used to
derive the optimal fixed sample size strategy and the optimal sequential screening strategy
for each of the five data sets. Both screening policies vary little over the five data sets, and
Figure 3 illustrates the two policies for data set C1. The stopping boundary characterizing
the optimal sequential screening policy is nearly linear, but slightly convex, for every data
set, and the slope increases with increased lot-to-lot variation. Since the slope determines
the acceptable yield level, as the lot-to-lot variability increases, more testing is required
before discarding the remaining wafers in the lot. The average acceptable yield level over
the five data sets is 13.7% lower than the overall average yield. The optimal fixed sample
size  policy  samples  either  )r  6  wafers  from  every  lot  in  each  data  set,  and  requires  a 
slightly  higher  yield  level  to  continue  testing  than  the  optimal  sequential  policy.  Since 
the  fixed  sample  size  policy  stops  monitoring  yield  after  a  lot  is  considered  acceptable 
while  the  sequential  screening  strategy  monitors  yield  continuously,  it  is  not  surprising 
that  the  former  only  accepts  lots  of  expected  higher  yield  level. 
Figure 3. Optimal screening policies for data set C1.

Recall that the strategy commonly used in industry is to perform exhaustive testing
and to choose the start rate so that the testing facility works at its effective capacity.
Before reporting our profit results, it is useful to determine an upper bound on the profit
increase that can be achieved relative to this straw strategy. Let $\lambda^E = \mu_T/((1 - q)L)$
denote the arrival rate under the exhaustive testing strategy, and let

$$y = \frac{M - \frac{\alpha b}{a - 1}}{M} \tag{54}$$

denote the average incoming yield. Then an upper bound on the relative profit increase
for any strategy is

$$\begin{aligned}
\frac{\lambda^*(V(u^*) - c_F) - \lambda^E(V(u^E) - c_F)}{\lambda^E(V(u^E) - c_F)}
&= \frac{\lambda^* - \lambda^E}{\lambda^E} + \frac{\lambda^*}{\lambda^E}\Big(\frac{V(u^*) - V(u^E)}{V(u^E) - c_F}\Big) \\
&\le \frac{\mu_F - \lambda^E}{\lambda^E} + \frac{\mu_F}{\lambda^E}\Big(\frac{L c_T (1 - y)}{rL(1 - q)My - L c_T - c_F}\Big).
\end{aligned} \tag{55}$$

Since we assumed that $L c_T = 0.03 c_F$, $rLM = 10 c_F$, $q = 0.05$ and the fab is at 90% of its
effective capacity under the exhaustive strategy, the upper bound equals

$$\Delta P_{\max} = \frac{1}{9} + \frac{10}{9}\Big(\frac{0.03(1 - y)}{9.5y - 1.03}\Big). \tag{56}$$

This quantity increases from 1/9 when yield is 100% to $\infty$ as yield approaches the critical
level of 10.58% required for profitability in (11). Although the exact average yield cannot
be revealed, the yield was greater than 36% for all five data sets, and hence $\Delta P_{\max}$ is
between 11.1% and 12.0%.
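The quoted range for the upper bound can be reproduced by evaluating (56) at the two ends of the stated yield interval. The short Python sketch below does this; the function name is a hypothetical choice.

```python
def delta_p_max(y):
    """Equation (56): upper bound on the relative profit increase at average yield y."""
    return 1.0 / 9.0 + (10.0 / 9.0) * (0.03 * (1.0 - y)) / (9.5 * y - 1.03)


bound_at_full_yield = delta_p_max(1.0)    # yield of 100%
bound_at_36_percent = delta_p_max(0.36)   # lowest yield consistent with the text
```

Expressed as percentages and rounded to one decimal place, the two values are 11.1 and 12.0, matching the range given above.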

Under the heading "Theoretical Calculations", Table 1 reports the profit increases
relative to the exhaustive testing strategy obtained by the two proposed strategies for
all five data sets. We also display $\rho_F = \lambda/\mu_F$ and $\rho_T = \lambda N(u)/\mu_T$, which represent
the effective capacity utilization of the fab and testing facility, respectively. It can be
seen that for every data set, both facilities work at their effective capacity. However, the
profit increases are rather small: out of a potential 11.1% to 12.0% increase, only a 2%
to 3% increase is achieved. Also, the difference in performance between the two proposed
policies is relatively small; the fixed sample size policy averages a 2.24% profit increase
over the five data sets, compared to 2.51% for the optimal strategy.
Table 1. Numerical results.

                    Theoretical Calculations           Simulation Results
Data Set            Fixed Sample Size   Sequential     Fixed Sample Size   Sequential

C1      ΔP               1.80             2.09              1.23              1.37
        ρ_F              1.000            1.000             1.000             1.000
        ρ_T              1.000            1.000             1.000             1.000

C1.5    ΔP               2.18             2.43              1.04             -4.02
        ρ_F              1.000            1.000             1.000             1.000
        ρ_T              1.000            1.000             1.000             0.948

C2      ΔP               2.99             3.27              0.70             -1.90
        ρ_F              1.000            1.000             1.000             1.000
        ρ_T              1.000            1.000             0.989             0.958

C2.5    ΔP               2.15             2.43              0.75              0.90
        ρ_F              1.000            1.000             1.000             1.000
        ρ_T              1.000            1.000             1.009             1.007

C3      ΔP               2.08             2.35              2.45              0.14
        ρ_F              1.000            1.000             1.000             1.000
        ρ_T              1.000            1.000             0.999             0.971

ΔP: percentage profit increase over exhaustive testing strategy
ρ_F: utilization of effective capacity of the fab
ρ_T: utilization of effective capacity of the testing facility


Since we do not know the order in which the wafers in each lot of the five data sets
were actually tested, it is difficult to test our proposed strategies directly on the actual
data. Therefore, we resorted to simulation, and generated 100 lots from each sample
lot by randomly shuffling the wafers in the lot. For each data set and each screening
policy, the average number of good chips obtained per lot and the average number of
wafers tested per lot were recorded. These quantities and the theoretically calculated
start rates were then used to calculate the profit increases that are reported in Table 1
under the heading "Simulation Results".

When the derived policies are tested on the actual data, two undesirable things can
occur. If the yield model underestimates the number of discarded wafers, then the testing
facility is underutilized and a feasible, but suboptimal, strategy is obtained. If the
yield model overestimates the number of discarded wafers, then the testing facility is
overutilized, and an infeasible strategy can result. Referring to Table 1, we see that the
yield model correctly predicts the average number of discarded wafers per lot under the
fixed sample size policy for three of the five data sets, and is off by about 1% on data
sets C2 and C2.5. However, the yield model is less accurate under the sequential policy,
underestimating the average number of discarded wafers per lot by 3-5% in three of the
five data sets. In these cases, the resulting profit is sometimes less than under exhaustive
testing. Under both policies, the yield model overestimated the number of discarded wafers
in data set C2.5 and the resulting strategy is not feasible; hence, the profit increases
reported for this data set correspond to a reduced start rate that maintains feasibility.
That is, the profit increase of 0.75 (0.90, respectively) was achieved by reducing the start
rate so that $\rho_F = 0.991$ (0.993, respectively) and $\rho_T = 1.000$. The average profit increase
over the five data sets for the fixed sample size strategy is 1.23% in the simulation study,
about one percentage point below the corresponding improvement achieved in the theoretical
calculations. The sequential strategy averages a 0.70% profit decrease relative to the straw
strategy, because of the underutilization of the testing facility in cases C1.5 and C2. Hence,
in addition to being easier to derive and to implement than the sequential strategy, the
fixed sample size strategy performs nearly as well in the analytical calculations, and
appears to be more robust in our limited simulation study.
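The four averages quoted in the text follow directly from the ΔP rows of Table 1. The Python sketch below recomputes them; the dictionary layout and the column labels are hypothetical names chosen for this example, while the numbers are the table entries.

```python
# ΔP entries from Table 1, in percent:
# (theoretical fixed, theoretical sequential, simulated fixed, simulated sequential)
delta_p = {
    "C1":   (1.80, 2.09, 1.23, 1.37),
    "C1.5": (2.18, 2.43, 1.04, -4.02),
    "C2":   (2.99, 3.27, 0.70, -1.90),
    "C2.5": (2.15, 2.43, 0.75, 0.90),
    "C3":   (2.08, 2.35, 2.45, 0.14),
}
columns = ("theory_fixed", "theory_sequential", "sim_fixed", "sim_sequential")

# Average each column over the five data sets.
averages = {
    name: sum(row[i] for row in delta_p.values()) / len(delta_p)
    for i, name in enumerate(columns)
}
```

The resulting averages are approximately 2.24, 2.51, 1.23 and -0.70, in agreement with the figures reported for the theoretical calculations and the simulation study.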

As a point of reference, we also considered the beta-binomial yield model, where the
number of bad chips on each wafer is modeled as a binomial random variable. This
yield model significantly overestimated the average number of wafers tested per lot: the
average value over the five data sets of $\rho_T$ under the sequential strategy in the simulation
study was only 0.861, which led to an average profit decrease of 11.68%.